Abstract 

Tubulins are a family of GTPases that are key components of the cytoskeleton in all eukaryotes and are distantly 
related to the FtsZ GTPase that is involved in cell division in most bacteria and many archaea. Among prokaryotes, 
bona fide tubulins have been identified only in bacteria of the genus Prosthecobacter. These bacterial tubulin genes 
appear to have been horizontally transferred from eukaryotes. Here we describe tubulins encoded in the genomes 
of thaumarchaeota of the genus Nitrosoarchaeum that we denote artubulins. Phylogenetic analysis results are 
compatible with the origin of eukaryotic tubulins from artubulins. These findings expand the emerging picture of 
the origin of key components of eukaryotic functional systems from ancestral forms that are scattered among the 
extant archaea. 

Reviewers: This article was reviewed by Gaspar Jekely and J. Peter Gogarten. 



Findings 

Tubulins comprise a distinct family of GTPases that are 
highly conserved among eukaryotes and are the major 
components of microtubules, an essential part of the 
eukaryotic cytoskeleton [1,2]. All eukaryotes encode 
multiple, paralogous tubulins that evolved through a ser- 
ies of gene duplications at early stages of eukaryote evo- 
lution as well as many subsequent, lineage-specific 
duplications [3]. Among prokaryotes, the only bona fide 
tubulins have been identified in several bacteria of the 
genus Prosthecobacter [4] in which they form microtubule-like 
structures closely resembling eukaryotic microtubules 
[5]. The tubulins of Prosthecobacter show 
high sequence and structural similarity to eukaryotic 
homologs, and given their extremely narrow distribution 
among prokaryotes, are thought to have evolved via hor- 
izontal transfer of a eukaryotic tubulin gene to an ances- 
tor of this group of bacteria [6,7]. The great majority of 
bacteria and many archaea encode FtsZ, a prokaryotic 
homolog of tubulin that plays a central role in cell 
division [8]. Both FtsZ and tubulin undergo GTP 
hydrolysis-dependent cycles of polymerization and 
depolymerization, and are mechanistically analogous [9,10]. 
However, FtsZ and tubulin share extremely weak 
sequence similarity, so that the homology became 
apparent only through comparison of crystal structures 
of these proteins [11]. Recent progress in genome 
sequencing and comparative genomics has revealed 
numerous previously unrecognized members of the 
FtsZ-tubulin protein superfamily [12,13]. These proteins 
considerably expand the range of sequence divergence 
compatible with the FtsZ-tubulin fold but none of them are 
candidates for the role of direct prokaryotic ancestors of 
tubulins. In the absence of such candidates, it is gener- 
ally assumed that tubulin evolved from FtsZ at the onset 
of eukaryote evolution, and this evolution engendered 
extreme sequence divergence associated with the shift in 
function [14]. Here we describe bona fide tubulins 
encoded in two recently sequenced genomes of Thau- 
marchaeota. Phylogenetic analysis suggests that these 
archaeal tubulins could be the direct ancestors of eukar- 
yotic tubulins, a conclusion that has general implications 
for the evolution of the key functional systems of the 
eukaryotic cell. 

Archaeal tubulins 

In the course of a systematic search for archaeal homo- 
logs of signature eukaryote proteins, we found that the 
best archaeal BLAST hits for tubulins were two closely 
related proteins from the recently sequenced genomes 
of Thaumarchaeota, Candidatus Nitrosoarchaeum lim- 
nia [15] and Candidatus Nitrosoarchaeum koreensis 
[16]. Eukaryotic tubulin sequences, in particular those of 
gamma-tubulins, aligned with these proteins over a 
region of approximately 300 amino acid residues, with 
e-values below 10^-13. Although the similarity between 
eukaryotic tubulins and FtsZ-like proteins from other 
archaea was also statistically significant, these alignments 
covered only regions of approximately 100 amino 
acids centered at the GTP-binding loop, with the most 
significant e-values of approximately 10^-8. Reciprocal BLAST 
searches using the Nitrosoarchaeum tubulin homologs 
(hereinafter artubulins) as queries showed significantly 
greater similarity to eukaryotic tubulins than to FtsZ 
proteins. 
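
The gap between the two e-values quoted above can be read as a difference in alignment bit scores. The following is a minimal sketch, assuming the standard Karlin-Altschul relation E = m * n * 2^(-S') and an identical search space for both hits; the function names are illustrative, and the actual bit scores and effective lengths are not given in the text.

```python
import math

def evalue_from_bits(bit_score: float, query_len: int, db_len: int) -> float:
    """Karlin-Altschul expectation E = m * n * 2**(-S') for a bit score S'."""
    return query_len * db_len * 2.0 ** (-bit_score)

def bit_score_gap(e_weaker: float, e_stronger: float) -> float:
    """Bit-score difference implied by two e-values from the same search space."""
    return math.log2(e_weaker / e_stronger)

# e-values quoted in the text: about 1e-8 (FtsZ-like hits) and 1e-13 (artubulins)
gap = bit_score_gap(1e-8, 1e-13)
print(f"implied bit-score difference: {gap:.1f}")
```

Under these assumptions, a five orders of magnitude difference in e-value corresponds to roughly 5 * log2(10) bits of additional alignment score.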

These observations prompted us to perform a detailed 
phylogenetic analysis of the tubulin protein family. To 
this end, we compiled a representative set of eukaryotic 
and bacterial tubulins and constructed a multiple align- 
ment of the sequences of these proteins with the artu- 
bulins (Figure 1; see Additional File 1 for the complete 
alignment). Examination of the conserved sequence 
motifs in the tubulin/FtsZ superfamily reveals several 
amino acid residues that are common to the tubulin 
family including artubulins but to the exclusion of FtsZ 
(Figure 1). The presence of the apparent synapomor- 
phies is best compatible with a common origin of artu- 
bulins and the rest of the tubulin family. 
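
The criterion used for the marked positions (a residue found in the majority of tubulins, including artubulins, but not in the majority of FtsZ sequences) can be sketched as a simple column scan. This is an illustration only: the function names, the 50% threshold as a reading of "majority", and the toy sequences are assumptions, and the sequences are made up rather than taken from Figure 1.

```python
from collections import Counter

def majority_residue(column):
    """Return (residue, fraction) for the most common non-gap residue in a column."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return None, 0.0
    residue, count = Counter(residues).most_common(1)[0]
    return residue, count / len(column)

def diagnostic_columns(ingroup, outgroup, threshold=0.5):
    """Columns where one residue occurs in the majority of ingroup sequences
    but not in the majority of outgroup sequences."""
    hits = []
    for i in range(len(ingroup[0])):
        in_col = [seq[i] for seq in ingroup]
        out_col = [seq[i] for seq in outgroup]
        residue, frac = majority_residue(in_col)
        if residue is None or frac <= threshold:
            continue
        out_frac = out_col.count(residue) / len(out_col)
        if out_frac <= threshold:
            hits.append((i, residue))
    return hits

# Toy, made-up aligned sequences (hypothetical, not from Figure 1)
tubulin_like = ["GQCGNQ", "GQAGNQ", "GQCGNE"]
ftsz_like = ["GGGGNA", "GGAGNA", "GVGGTA"]
print(diagnostic_columns(tubulin_like, ftsz_like))
```

Columns shared by both groups (such as an invariant glycine) are not reported, because they do not discriminate tubulins from FtsZ.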

The multiple alignment of the tubulin/FtsZ superfam- 
ily (see Additional File 2) was employed to build maxi- 
mum likelihood phylogenetic trees using FtsZ proteins 
as the outgroup. In the resulting phylogenetic tree, the 
artubulins form the sister group to all eukaryotic and 
bacterial tubulins (Figure 2A). In contrast, the Prostheco- 
bacter tubulins were the sister group of the eukaryotic 
alpha/beta tubulin branch. Furthermore, this branch 
included two distinct tubulins that we identified in par- 
tial genomic sequences of the giant gamma proteobac- 
terium Beggiatoa (Figure 2A). Thus, all available 
bacterial tubulin sequences grouped within the eukaryotic 
tubulin family. Constrained tree analysis showed that the 
alternative tree topology, in which the artubulins 
grouped with bacterial tubulins, could be rejected at a 
statistically significant level; grouping of artubulins with 
different families of eukaryotic tubulins could not be 
similarly rejected (with one exception) although all alter- 
native topologies showed lower likelihood than the tree 
in Figure 2A (see Additional File 3). These findings 
appear to be best compatible with a scenario in which 
the artubulins are direct evolutionary ancestors of the 
eukaryotic tubulins whereas bacterial tubulins originated 
as a result of horizontal transfer of eukaryotic alpha/beta 
tubulin genes into at least two bacterial lineages. 
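
The idea behind comparing constrained topologies with expected likelihood weights can be illustrated with a simplified, RELL-style sketch: sites are resampled with replacement, the per-topology log-likelihoods are summed, and the normalized likelihood weights are averaged over replicates. This is not the TreeFinder implementation; the function name and the per-site log-likelihood values are hypothetical.

```python
import math
import random

def expected_likelihood_weights(site_loglik, n_boot=1000, seed=0):
    """site_loglik[t][s] is the log-likelihood of site s under topology t."""
    rng = random.Random(seed)
    n_topos = len(site_loglik)
    n_sites = len(site_loglik[0])
    totals = [0.0] * n_topos
    for _ in range(n_boot):
        # Resample alignment sites with replacement (nonparametric bootstrap)
        sample = [rng.randrange(n_sites) for _ in range(n_sites)]
        logl = [sum(row[s] for s in sample) for row in site_loglik]
        # Likelihood weights, shifted by the maximum for numerical stability
        best = max(logl)
        rel = [math.exp(value - best) for value in logl]
        norm = sum(rel)
        for t in range(n_topos):
            totals[t] += rel[t] / norm
    return [total / n_boot for total in totals]

# Made-up per-site log-likelihoods for three hypothetical topologies
toy = [
    [-2.0, -1.5, -3.0, -2.2],
    [-2.1, -1.6, -3.4, -2.5],
    [-2.6, -2.4, -3.9, -3.1],
]
weights = expected_likelihood_weights(toy)
print(weights)
```

The weights sum to one, and topologies with small weights are the candidates for rejection at a chosen confidence level.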

Partially conserved genomic neighborhoods of archaeal 
and bacterial tubulin genes: functional and evolutionary 
implications 

Notably, in the two Nitrosoarchaeum genomes the tubulin 
gene is located next to the Snf7 gene, which encodes 
one of the subunits of the ESCRT-III complex (Figure 
2B). The ESCRT-III complex is conserved in all eukaryotes 
and is involved in intracellular membrane remodeling 
[17]. Recently, archaeal homologs of ESCRT-III 
subunits have been identified and shown to function as 
an essential component of the cell division apparatus in 
some archaea, primarily from the phyla Crenarchaeota 
and Thaumarchaeota [18-22]. In particular, it has been 
shown that Nitrosopumilus maritimus, a thaumarchaeon 
that belongs to the same family, Nitrosopumilaceae, as 
the tubulin-encoding Nitrosoarchaeum, employs 
ESCRT-III as the cell division machinery despite the 
presence of FtsZ [23]. In addition to artubulins, the two 
Nitrosoarchaeum genomes encode regular FtsZ proteins 
that show very close sequence similarity to the 
Nitrosopumilus FtsZ (Figure 1). However, given the above data 
on the non-essentiality of FtsZ for cell division in 
Nitrosopumilus and the fact that artubulin and Snf7 genes are 
colocalized and possibly coexpressed in Nitrosoarchaeum 
(Figure 2B), it can be predicted that artubulin and 
ESCRT-III cooperate in cell division in these organisms. 

The genomic neighborhood of artubulin-Snf7 in Nitrosoarchaeum 
has a readily detectable counterpart in 
Nitrosopumilus: a block of genes including those for 
artubulin and Snf7 appears to be inserted into the conserved 
neighborhood between the genes for a Superfamily 
2 helicase and CMP/dCMP deaminase (Figure 2B). 
This relationship between the genome organizations of 
Nitrosoarchaeum and Nitrosopumilus suggests that the divergence 
between artubulin-encoding and artubulin-lacking 
Thaumarchaeota involved rearrangement, possibly associated 
with horizontal gene transfer. The genes encoding 
bacterial tubulins are embedded in a completely differ- 
ent genomic neighborhood; however, a parallel exists 
with artubulins in that the tubulin genes in Prostheco- 
bacter appear to have been inserted into a neighborhood 
that is partially conserved in distantly related Verrucomi- 
crobia that lack tubulins (Figure 2B). Moreover, similarly 
to Nitrosoarchaeum, some Prosthecobacter genomes 
encode both tubulin and FtsZ [24]. 

Implications for eukaryogenesis 

Recent comparative genomic research on the origin of 
eukaryotes has revealed an unexpected pattern of 
archaeo-eukaryotic evolutionary relationships. Likely 
ancestors of the key functional systems of the eukaryotic 
cell have been shown to be scattered among diverse 
extant archaea. The cases in point include DNA poly- 
merases [25], RNA polymerase subunits [26], various 
molecular complexes involved in membrane remodeling 
and cell division [20], and the ubiquitin signaling system 
[27]. The ubiquitin case appears particularly striking: a 
single archaeal genome, Caldiarchaeum subterraneum, 
that is likely to represent a distinct phylum [28], 
encompasses genes encoding all the essential components 
of the ubiquitin system [27]. The artubulins add 
another key piece to the puzzle of eukaryote origin. 
Although currently pertaining to a single archaeal lineage, 
this finding has substantial implications given that 
tubulins comprise one of the most abundant, highly 
conserved, essential protein families in eukaryotes. The 
likely origin of eukaryotic tubulins from ancestral forms 
represented in a specific lineage of archaea parallels the 
evolutionary scenario for actins, the second major family 
of cytoskeletal proteins in eukaryotes. The archaeal 
actin homologs, denoted crenactins, which are the probable 
ancestors of the eukaryotic actin family [29], are 
represented in Thermoproteales, Korarchaeum cryptofilum, 
and C. subterraneum [20,28], and form cytoskeletal 
elements essential for cell division in at least some of 
these Archaea [30]. It appears most likely that in 
Nitrosoarchaeum, which does not encode actin, a similar role 
belongs to artubulin. Taken together, these findings 
reinforce the mosaicism of the archaeal roots of eukaryotes 
and seem to be best compatible with the hypothesis 
that the archaeal ancestor of eukaryotes was a 
highly complex, possibly transient prokaryotic organism 
[31]. 

[Figure 1: alignment blocks not reproduced here; see Additional File 1 for the complete alignment.] 

Figure 1 Conserved sequence blocks in the tubulin/FtsZ superfamily. The conserved blocks are separated by numbers which indicate the 
length of less well conserved sequence segments that are not shown (see Additional File 1). The alignment columns are colored on the basis of 
the respective position conservation throughout the superfamily: yellow background indicates hydrophobic residues (ACFILMVWY), red letters 
indicate polar residues (DEHKNQR), and green background indicates small residues (ACGNPSW). Asterisks indicate amino acid residues that are 
conserved in the majority of the tubulins including artubulins but not in the majority of the FtsZ sequences. Each sequence is denoted by the 
corresponding taxon abbreviation followed by the species abbreviation and GenBank Identification (GI) number. Taxa abbreviations: A, Archaea; 
B, Bacteria; E, Eukaryota; Ac, Crenarchaeota; Ae, Euryarchaeota; An, Nanoarchaeota; At, Thaumarchaeota; Bv, Verrucomicrobia; Ec, Alveolata; Ek, 
Euglenozoa; El, Fungi/Metazoa group; Ep, Viridiplantae; Eq, Heterolobosea. Species abbreviations: Arath, Arabidopsis thaliana; Chlre, Chlamydomonas 
reinhardtii; Danre, Danio rerio; Drome, Drosophila melanogaster; Galga, Gallus gallus; Mondo, Monodelphis domestica; Musmu, Mus musculus; Naegr, 
Naegleria gruberi; Naneq, Nanoarchaeum equitans Kin4-M; Nemve, Nematostella vectensis; Ornan, Ornithorhynchus anatinus; Parte, Paramecium 
tetraurelia; Phypa, Physcomitrella patens; Plakn, Plasmodium knowlesi strain H; Plavi, Plasmodium vivax Sal-1; Prodeb, Prosthecobacter debontii; 
Prodej, Prosthecobacter dejongeii; Prova, Prosthecobacter vanneervenii; Sacce, Saccharomyces cerevisiae S288c; Strpu, Strongylocentrotus purpuratus; 
Tetth, Tetrahymena thermophila; Triad, Trichoplax adhaerens; Trycr, Trypanosoma cruzi; Xenla, Xenopus laevis. 

Figure 2 Archaeal, bacterial and eukaryotic tubulins. (A) Phylogenetic tree of the FtsZ-tubulin superfamily. The tree is rooted by FtsZ 
proteins; TreeFinder/Molphy bootstrap values are indicated for major branches. The sequences are denoted as in Figure 1. Asterisks mark 
diverged ciliate tubulins [12]. (B) Genome neighborhood of bacterial and thaumarchaeal tubulins. Genes are marked as follows: A, bacterial 
tubulin A; B, bacterial tubulin B; C, tetratricopeptide repeat protein referred to as bacterial kinesin light chain in refs (PMIDs 12486237 and 
17942428); T, thaumarchaeal tubulin; S, Snf7; 1, putative serine/threonine kinase; 2, pyruvate phosphate dikinase; 3, aspartate aminotransferase; 4, 
response regulator with a HTH DNA-binding domain; 5, glucose/sorbose dehydrogenase; 6, cysteine synthase; 7, DEAD/DEAH box helicase; 8, 
Major facilitator superfamily MFS 1; 9, TATA-box binding protein; 10, zinc-binding CMP/dCMP deaminase; 11, DNA polymerase I; 12, conserved 
hypothetical protein; 13, triosephosphate isomerase; and 14, AsnC family transcriptional regulator. Syntenic regions between Nitrosoarchaeum 
koreensis and Nitrosopumilus maritimus are shaded. 

Conclusions 

We show here that two species of Thaumarchaeota from 
the genus Nitrosoarchaeum encode members of the 
FtsZ/tubulin superfamily that are more closely related to 
eukaryotic tubulins than to any archaeal or bacterial 
homologs; we denote these proteins artubulins. The 
results of phylogenetic analysis are compatible with the 
basal position of the artubulins in the tubulin family and 
hence with the ancestral status of artubulins with 
respect to the eukaryotic tubulins. This finding adds 
more weight to the emerging scenario of the origin of the 
first eukaryotic cells from highly complex, possibly transient 
archaeal forms. 

Methods 

Protein sequences were retrieved from the non-redun- 
dant database at the National Center for Biotechnology 
Information (NIH, Bethesda). A reference set of FtsZ 
proteins was taken from a previous study [13]. Informa- 
tion on highly diverged eukaryotic tubulins was taken 
from [12,32]. The non-redundant protein sequence data- 
base was searched using the PSI-BLAST program [33]. 
Protein sequences were aligned using the MUSCLE pro- 
gram [34]; columns containing a large fraction of gaps 
(greater than 30%) and non-homogeneous columns 
defined as described previously [35] were removed from 
the alignment. The resulting 278-column alignment was 
used to construct two maximum likelihood (ML) phylogenetic 
trees, one using the FastTree program [36] with 
default parameters (JTT evolutionary model, discrete 
gamma model with 20 rate categories) and the other 
using the MOLPHY program [37] with the JTT substitution 
matrix to perform local rearrangement of an original 
Fitch tree [38]. Phylogenetic tree topology testing 
was performed with the TreeFinder program [39] using 
either expected likelihood weights (ELWs [40]) or the 
approximately unbiased (AU) test P values [41]. An 
unconstrained ML tree was compared with 9 constrained 
topologies, which were constructed by grouping 
the artubulins with one of the major branches 
of tubulins (see Additional File 3). 
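
The gap-based column filter described above can be sketched as follows. This is a minimal illustration, assuming an alignment held as equal-length strings with "-" as the gap character; the function name and the toy alignment are hypothetical, and the homogeneity criterion of [35] is not implemented here.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.30):
    """Remove alignment columns whose gap fraction exceeds max_gap_fraction."""
    n_seqs = len(alignment)
    n_cols = len(alignment[0])
    keep = [
        i for i in range(n_cols)
        if sum(seq[i] == "-" for seq in alignment) / n_seqs <= max_gap_fraction
    ]
    return ["".join(seq[i] for i in keep) for seq in alignment]

# Toy, made-up alignment of four sequences
toy_alignment = ["MK-LV", "MR-LI", "M--LV", "MKALV"]
filtered = drop_gappy_columns(toy_alignment)
print(filtered)
```

In the toy input, only the third column (75% gaps) exceeds the 30% threshold and is dropped; the second column (25% gaps) is retained.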

Reviewers' reports 

Reviewer 1: Gaspar Jekely, Max Planck Institute for 
Developmental Biology, Tuebingen, Germany 

The identification of the Nitrosoarchaeum tubulins by 
Yutin and Koonin is potentially interesting, and upon 
first read I tended to agree with their conclusion that 
the results are compatible with the origin of eukaryotic 
tubulins from Nitrosoarchaeum tubulins. However, 
upon closer inspection, I found a few potential caveats 
with this interpretation, and I would like to ask the 
authors to address these. Additionally, I would also like 
to suggest a few points that would need further 
clarification. 

The authors write that "Eukaryotic tubulin sequences 
... aligned with these proteins [Nitrosoarchaeum tubu- 
lins] over a region of approximately 300 amino acid 
residues" and that "the similarity between eukaryotic 
tubulins and FtsZ-like proteins ... covered regions of 
approximately 100 amino acid centered at the GTP- 
binding loops". Having performed the blast searches, I 
can confirm these results. However, a statement like 
this is slightly misleading, since it implies that Nitro- 
soarchaeum tubulins are related to eukaryotic tubulins 
across 300 residues, and FtsZ only across 100 residues, 
and that is not true. If one performs psi-blast searches, 
after three iterations it becomes apparent that the 
alignments with Nitrosoarchaeum tubulins and FtsZ 
proteins all cover about 61-66% of the query sequence 
(I used mouse alpha tubulin as query). This section 
should be clarified to indicate that the region of 
homology is not longer between the Nitrosoarchaeum 
sequences and tubulins, than between FtsZ and 
tubulin. 

Authors' response: The text in question pertains to a 
single iteration of BLAST search and the observed differ- 
ences are highlighted to emphasize the greater similarity 
between artubulins and tubulins compared to the tubu- 
lins versus FtsZ. There is no implication that the actual 
homologous domains are of different size. Indeed, it 
should be obvious that the GTPase domain is conserved as a 
whole. 

I would like to point out a caveat about the rooting of 
the tree in Figure 2. The authors chose to root it on 
FtsZ proteins, however, with the same topology, the tree 
could also be rooted on the Nitrosoarchaeum sequences, 
and this would show the FtsZ clade as a sister to eukar- 
yotic tubulins. Alternatively, the root could also be 
placed between eukaryotic sequences and FtsZ + Nitrosoarchaeum. 
These different rootings would dramatically 
affect the conclusions of the paper. The only justifica- 
tion for using FtsZ as a root, I assume, is that in blast 
searches the Nitrosoarchaeum sequences show higher 
similarity to the eukaryotic tubulins. This means that 
the phylogenetic tree does not constitute independent 
evidence from blast, and therefore does not confirm the 
close relationship between the Nitrosoarchaeum 
sequences and eukaryotic tubulins. 

Authors' response: The rooting of the tree in Figure 2 is 
justified not so much by sequence similarity but by phy- 
letic distribution of tubulins and FtsZ. Indeed, both 
Nitrosoarchaea, in addition to the artubulins, encode 
FtsZ proteins typical of other Thaumarchaeota. Rooting 
the tree by artubulins would imply an ancient duplica- 
tion with subsequent massive loss of artubulin genes in 
all bacteria and archaea except for two Nitrosoarchaea, 
which is an extremely non-parsimonious scenario. 

The good blast score is due to alignments that are 
longer between the eukaryotic tubulins and the Nitro- 
soarchaeum sequences. However, if one looks at the 
extended alignment at the C-terminal side, the similarity 
is really poor, and this similarity is not picked up by 
blast, if only this portion is used. Could this extended 
alignment be due to residue composition or other bias 
(e.g. Nitrosoarchaeum sequences are less derived than 
FtsZ)? There is another disturbing observation. Namely, 
if one blasts with the portion of eukaryotic tubulins that 
is represented in the alignment (e.g. 
El_Musmu58037275), the best hit is to Thermococcus 
FtsZ (2e-08), and not Nitrosoarchaeum, which doesn't 
even show up until a psi-blast iteration is performed. 
Since the argument hinges on the phylogenetic tree, the 
above considerations should inspire extra caution. 

Authors' response: These concerns seem to stem from a 
certain misunderstanding of the way the BLAST algorithm 
works. The algorithm extends the initial hit to the extent 
that is statistically justified and halts when further 
extension leads to increased E-values. Therefore a longer 
alignment recovered by BLAST is indeed evidence of 
greater sequence similarity. Spurious extension of an 
alignment due to compositional bias is possible but 
highly unlikely given the composition-based statistical 
corrections implemented in the current version of BLAST 
[42,43]. 

As a minor comment, I suggest that the authors discuss 
in more detail the phylogenetic position of the 
Prosthecobacter tubulins in their tree, in particular 
since it has been suggested by others that Prosthecobacter 
tubulins may be ancestral to all eukaryotic tubulins. 
For example, Pilhofer et al. [5] speculate about a "verti- 
cal evolution" scenario where eukaryotic tubulins 
evolved from the bacterial ones. This may have been 
justified given the poor resolution of their trees, showing 
no clear relationship between Prosthecobacter tubulins 
and any of the eukaryotic tubulin families. The present 
paper shows a tree (the first one to my knowledge) that 
finds strong support for a clade uniting Prosthecobacter 
tubulins with alpha and beta tubulins. This strongly 
argues against the vertical evolution scenario. This 
would be important to discuss, given that these bacterial 
tubulins sometimes feature in arguments about a pur- 
ported evolutionary connection between eukaryotes and 
Planctomycetes-Verrucomicrobia-Chlamydiae bacteria, 
e.g. [44]. 

The relationship of Prosthecobacter tubulins and alpha 
and beta tubulins is not resolved. Others concluded [6] 
that Prosthecobacter tubulins have mosaic sequences 
with intertwining features from both alpha and beta 
tubulin. This analysis and the tree shown in this paper 
are consistent with a scenario where Prosthecobacter 
tubulins arose from an early horizontal gene transfer 
from an ancient tubulin, prior to the duplication of 
alpha and beta. This may be interesting to point out. It 
would also be interesting to see a technical comment on 
why the position of Prosthecobacter is resolved in the 
present tree, but not in previous attempts. Was there a 
difference in methodology? Were the sequence evolution 
models used more realistic in this study? 

Authors' response: Pinpointing the exact reasons 
behind differences in the results of phylogenetic analyses 
is very difficult. We are inclined to believe that the key 
factor is the more representative and balanced species 
sampling behind the trees presented here. 

That said, we have investigated the phylogenetic position 
of bacterial tubulins in greater detail, with the following 
conclusions. 

1. Placing the bacterial branches outside the eukaryotic 
tubulin subtree was firmly rejected by the same statistical 
test of tree topology that we used in the paper (AU < 
0.01). Thus, we have reasonable confidence that Prosthecobacterial 
tubulins are not ancestors of eukaryotic 
tubulins. 

2. Monophyly of bacterial tubulins remains a matter of 
considerable uncertainty. This clade is not strongly supported 
in the tree in Figure 2 (bootstrap value of 71 at 
best). Furthermore, per suggestion of reviewer 2, we ran 
ProtTest [39] to select the best substitution matrix, which 
in this case turned out to be the LG matrix. Two alternative 
trees, using RAxML [45] and TreeFinder [39], were 
constructed from the same alignment as used for the tree 
shown in Figure 2A. In both trees, Prosthecobacterial tubulins 
A and B grouped, respectively, with eukaryotic tubulins 
alpha and beta but the respective branches were not 
supported (Additional File 4). In addition, eukaryotic 
and Prosthecobacterial tubulins were realigned without 
artubulins and FtsZ, in order to obtain an extended, 
higher quality alignment, and a tree was constructed 
using TreeFinder (Additional File 4). In this tree, bacterial 
tubulin A grouped with alpha/beta tubulins whereas bacterial 
tubulin B grouped with gamma tubulins, exactly 
reproducing the topology in Figure 6 of Pilhofer et al. [5] 
but again with weak support. 

Thus, we can only assert that Prosthecobacterial tubu- 
lins evolved from within the eukaryotic subtree but the 
actual scenario for their evolution remains uncertain. 
However, even this conclusion is sufficient to dismiss 
Prosthecobacterial tubulins as an argument for an evolu- 
tionary connection between the PVC superphylum of 
bacteria and eukaryotes. This connection seems to be 
non-existent as argued in detail elsewhere [46]. 

Reviewer 2: J. Peter Gogarten, University of Connecticut 

The manuscript by Natalya Yutin and Eugene Koonin 
reports an exciting discovery: the presence of tubulin 
encoding genes in Thaumarchaeota. Tubulins are an 
important component of the eukaryotic cytoskeleton. 
The absence of closely related sequences ancestral to 
tubulins in prokaryotes was used to argue for a eukaryo- 
tic stem group that existed for a long period of time as 
a lineage distinct from archaea and bacteria. The argu- 
ment is that a lot of substitutions would be needed to 
evolve tubulin from FtsZ, and that these many substitutions 
would be more compatible with a deep origin of 
the eukaryotes (see [47] for discussion). The finding of 
archaeal tubulins weakens this argument: If some 
archaea possess tubulins that branch outside the eukar- 
yotic tubulins, but much closer to the tubulins than to 
FtsZ, then it is conceivable that the eukaryotes branch 
from inside the archaeal domain and inherited the tubu- 
lin from their archaeal ancestor. 

Authors' response: Indeed, the discovery of artubulins 
seems to invalidate the use of the distant relationship 
between tubulins and FtsZ as an argument for a eukar- 
yotic stem outside Archaea. Along with other recent 
observations, e.g. [20], these findings seem to be best 
compatible with the origin of eukaryotes from a highly 
complex, possibly transiently existing archaeon [29,31]. 

The archaeal tubulin sequences presented and dis- 
cussed by Yutin and Koonin appear more similar to the 
eukaryotic tubulins than to FtsZ. This observation is 
confirmed by their phylogenetic analysis. The authors 
discuss their findings with appropriate caution, and I 
don't think that more sophisticated analyses will change 
the findings; nevertheless, the following two concerns 
seem worthy of consideration: First, the phylogenetic 
reconstruction does not appear to consider among-site 
rate variation (ASRV), i.e., the choice of model used in 
phylogenetic reconstruction using maximum likelihood 
is not well described and justified. If one were to incor- 
porate ASRV, I expect the deep branches of the phylo- 
geny to become longer, because multiple substitutions 
are more efficiently corrected for, and the distinction 
between tubulins (including the archaeal tubulins) and 
FtsZs becomes stronger, strengthening the authors' conclusion 
that these are indeed tubulins. Nevertheless, the 
choice of model should be discussed, and could be 
improved. 

Authors' response: We applied ProtTest and found LG 
to be the optimal matrix; accordingly, two alternative 
trees were built using LG (Additional File 4). The topologies 
of these trees differ in many places from the topology 
of the tree in Figure 2 but these differences do not affect 
the conclusions of this work (see also the response to 
Reviewer 1 regarding the bacterial tubulins). 

Second, aligning divergent sequences is difficult, and 
the alignment itself can create a strong phylogenetic 
bias. This is a concern, because the archaeal tubulins 
and the FtsZ sequences are very divergent. How certain 
can we be that these archaeal sequences group outside 
the eukaryotic domain, as one would expect if archaeal 
tubulins were ancestral to the eukaryotic ones, and not 
inside the eukaryotic domain, as one would expect if the 
Thaumarchaeota acquired the tubulins from a eukaryote 
through horizontal gene transfer? An analysis that simultaneously 
considers phylogeny and alignment, such as 
SATe [48], might help to exclude the possibility of a 
eukaryote to archaeon transfer with more confidence. 
However, the best approach to address this uncertainty 
will be additional archaeal tubulin sequences, which 
hopefully will become available in the future. 

Authors' response: SATe is beyond doubt an attractive 
phylogenetic approach but one that has not been suffi- 
ciently tested on phylogenies including distantly related, 
real sequences. We fully agree with the reviewer that the 
primary advance is likely to be brought about by further 
sampling of diverse archaea that is expected to reveal a 
greater diversity of artubulins. 

Minor suggestions: In spelling the species names for 
Candidatus species, the convention is to italicize the 
word Candidatus, and to leave the suggested species 
name in normal font, e.g., Candidatus Nitrosoarchaeum 
koreensis. As no members of the genus have been culti- 
vated, the Candidatus should also be used for the genus 
(e.g., the corresponding line in the abstract should read: 
"... genus Candidatus Nitrosoarchaeum that we denote 
artubulins. Phylogenetic ..." - also, the period was missing
after artubulins). 

Authors' response: Corrected. 

Additional material 



Acknowledgements 

The authors' research is supported by the US Department of Health and 
Human Services intramural funds (to National Library of Medicine). 

Authors' contributions 

NY collected the data; NY and EVK analyzed the data; EVK wrote the 
manuscript, which was read and approved by both authors. 

Competing interests 

The authors declare that they have no competing interests. 

Received: 22 December 2011 Accepted: 29 March 2012 
Published: 29 March 2012 

References 

1. Nogales E: Structural insights into microtubule function. Annu Rev 
Biochem 2000, 69:277-302. 

2. Wade RH: On and around microtubules: an overview. Mol Biotechnol 
2009, 43(2):177-191. 

3. Dutcher SK: Long-lost relatives reappear: identification of new members 
of the tubulin superfamily. Curr Opin Microbiol 2003, 6(6):634-640. 

4. Jenkins C, Samudrala R, Anderson I, Hedlund BP, Petroni G, Michailova N, 
Pinel N, Overbeek R, Rosati G, Staley JT: Genes for the cytoskeletal protein 
tubulin in the bacterial genus Prosthecobacter. Proc Natl Acad Sci USA 
2002, 99(26):17049-17054. 

5. Pilhofer M, Ladinsky MS, McDowall AW, Petroni G, Jensen GJ: Microtubules 
in bacteria: ancient tubulins build a five-protofilament homolog of the 
eukaryotic cytoskeleton. PLoS Biol 2011, 9(12):e1001213. 

6. Martin-Galiano AJ, Oliva MA, Sanz L, Bhattacharyya A, Serna M, Yebenes H, 
Valpuesta JM, Andreu JM: Bacterial tubulin distinct loop sequences and 
primitive assembly properties support its origin from a eukaryotic 
tubulin ancestor. J Biol Chem 2011, 286(22):19789-19803. 

7. Schlieper D, Oliva MA, Andreu JM, Lowe J: Structure of bacterial tubulin 
BtubA/B: evidence for horizontal gene transfer. Proc Natl Acad Sci USA 
2005, 102(26):9170-9175. 

8. Erickson HP, Anderson DE, Osawa M: FtsZ in bacterial cytokinesis: 
cytoskeleton and force generator all in one. Microbiol Mol Biol Rev 2010, 
74(4):504-528. 

9. Aylett CH, Lowe J, Amos LA: New insights into the mechanisms of 
cytomotive actin and tubulin filaments. Int Rev Cell Mol Biol 2011, 
292:1-71. 

10. Ingerson-Mahar M, Gitai Z: A growing family: the expanding universe of 
the bacterial cytoskeleton. FEMS Microbiol Rev 2012, 36(1):256-267. 

11. Nogales E, Downing KH, Amos LA, Lowe J: Tubulin and FtsZ form a 
distinct family of GTPases. Nat Struct Biol 1998, 5(6):451-458. 



12. Libusova L, Draber P: Multiple tubulin forms in ciliated protozoan 
Tetrahymena and Paramecium species. Protoplasma 2006, 227(2-4):65-76. 

13. Makarova KS, Koonin EV: Two new families of the FtsZ-tubulin protein 
superfamily implicated in membrane remodeling in diverse bacteria and 
archaea. Biol Direct 2010, 5:33. 

14. Erickson HP: Evolution of the cytoskeleton. Bioessays 2007, 29(7):668-677. 

15. Blainey PC, Mosier AC, Potanina A, Francis CA, Quake SR: Genome of a low- 
salinity ammonia-oxidizing archaeon determined by single-cell and 
metagenomic analysis. PLoS One 2011, 6(2):e16626. 

16. Kim BK, Jung MY, Yu DS, Park SJ, Oh TK, Rhee SK, Kim JF: Genome 
sequence of an ammonia-oxidizing soil archaeon, 'Candidatus 
Nitrosoarchaeum koreensis' MY1. J Bacteriol 2011, 193(19):5539-5540. 

17. Henne WM, Buchkovich NJ, Emr SD: The ESCRT pathway. Dev Cell 2011, 
21(1):77-91. 

18. Samson RY, Obita T, Freund SM, Williams RL, Bell SD: A role for the ESCRT 
system in cell division in archaea. Science 2008, 322(5908):1710-1713. 

19. Samson RY, Bell SD: Ancient ESCRTs and the evolution of binary fission. 
Trends Microbiol 2009, 17(11):507-513. 

20. Makarova KS, Yutin N, Bell SD, Koonin EV: Evolution of diverse cell division 
and vesicle formation systems in Archaea. Nat Rev Microbiol 2010, 
8(10):731-741. 

21. Lindas AC, Karlsson EA, Lindgren MT, Ettema TJ, Bernander R: A unique cell 
division machinery in the Archaea. Proc Natl Acad Sci USA 2008, 
105(48):18942-18946. 

22. Ettema TJ, Bernander R: Cell division and the ESCRT complex: a surprise 
from the archaea. Commun Integr Biol 2009, 2(2):86-88. 

23. Pelve EA, Lindas AC, Martens-Habbena W, de la Torre JR, Stahl DA, 
Bernander R: Cdv-based cell division and cell cycle organization in the 
thaumarchaeon Nitrosopumilus maritimus. Mol Microbiol 2011, 
82(3):555-566. 

24. Pilhofer M, Rosati G, Ludwig W, Schleifer KH, Petroni G: Coexistence of 
tubulins and ftsZ in different Prosthecobacter species. Mol Biol Evol 2007, 
24(7):1439-1442. 

25. Tahirov TH, Makarova KS, Rogozin IB, Pavlov YI, Koonin EV: Evolution of 
DNA polymerases: an inactivated polymerase-exonuclease module in Pol 
epsilon and a chimeric origin of eukaryotic polymerases from two 
classes of archaeal ancestors. Biol Direct 2009, 4:11. 

26. Blombach F, Makarova KS, Marrero J, Siebers B, Koonin EV, van der Oost J: 
Identification of an ortholog of the eukaryotic RNA polymerase III 
subunit RPC34 in Crenarchaeota and Thaumarchaeota suggests 
specialization of RNA polymerases for coding and non-coding RNAs in 
Archaea. Biol Direct 2009, 4:39. 

27. Nunoura T, Takaki Y, Kakuta J, Nishi S, Sugahara J, Kazama H, Chee GJ, 
Hattori M, Kanai A, Atomi H, et al: Insights into the evolution of Archaea 
and eukaryotic protein modifier systems revealed by the genome of a 
novel archaeal group. Nucleic Acids Res 2011, 39(8):3204-3223. 

28. Guy L, Ettema TJ: The archaeal 'TACK' superphylum and the origin of 
eukaryotes. Trends Microbiol 2011, 19(12):580-587. 

29. Yutin N, Wolf MY, Wolf YI, Koonin EV: The origins of phagocytosis and 
eukaryogenesis. Biol Direct 2009, 4:9. 

30. Ettema TJ, Lindas AC, Bernander R: An actin-based cytoskeleton in 
archaea. Mol Microbiol 2011, 80(4):1052-1061. 

31. Koonin EV: The Logic of Chance: The Nature and Origin of Biological Evolution 
Upper Saddle River NJ: FT Press; 2011. 

32. Dutcher SK: The tubulin fraternity: alpha to eta. Curr Opin Cell Biol 2001, 
13(1):49-54. 

33. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, 
Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein 
database search programs. Nucleic Acids Res 1997, 25(17):3389-3402. 

34. Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and 
high throughput. Nucleic Acids Res 2004, 32(5):1792-1797. 

35. Yutin N, Makarova KS, Mekhedov SL, Wolf YI, Koonin EV: The deep archaeal 
roots of eukaryotes. Mol Biol Evol 2008, 25(8):1619-1630. 

36. Price MN, Dehal PS, Arkin AP: FastTree 2-approximately maximum- 
likelihood trees for large alignments. PLoS One 2010, 5(3):e9490. 

37. Adachi J, Hasegawa M: MOLPHY: Programs for Molecular Phylogenetics Tokyo: 
Institute of Statistical Mathematics; 1992. 

38. Fitch WM, Margoliash E: Construction of phylogenetic trees. Science 1967, 
155(760):279-284. 



Additional file 1: Multiple alignment of Tubulin/FtsZ family 
proteins. 

Additional file 2: Filtered multiple alignment of Tubulin/FtsZ family 
proteins used for tree construction. 

Additional file 3: Statistical tests on the topology of the 
phylogenetic tree of the tubulin/FtsZ superfamily. 

Additional file 4: Additional phylogenetic trees constructed using 
the RAxML and TreeFinder methods. 






39. Jobb G, von Haeseler A, Strimmer K: TREEFINDER: a powerful graphical 
analysis environment for molecular phylogenetics. BMC Evol Biol 2004, 
4:18. 

40. Strimmer K, Rambaut A: Inferring confidence sets of possibly misspecified 
gene trees. Proc Biol Sci 2002, 269(1487):137-142. 

41. Shimodaira H: An approximately unbiased test of phylogenetic tree 
selection. Syst Biol 2002, 51(3):492-508. 

42. Altschul SF, Wootton JC, Gertz EM, Agarwala R, Morgulis A, Schaffer AA, 
Yu YK: Protein database searches using compositionally adjusted 
substitution matrices. FEBS J 2005, 272(20):5101-5109. 

43. Yu YK, Gertz EM, Agarwala R, Schaffer AA, Altschul SF: Retrieval accuracy, 
statistical significance and compositional similarity in protein sequence 
database searches. Nucleic Acids Res 2006, 34(20):5966-5973. 

44. Reynaud EG, Devos DP: Transitional forms between the three domains of 
life and evolutionary implications. Proc Biol Sci 2011, 278(1723):3321-3328. 

45. Stamatakis A, Ludwig T, Meier H: RAxML-III: a fast program for maximum 
likelihood-based inference of large phylogenetic trees. Bioinformatics 
2005, 21(4):456-463. 

46. McInerney JO, Martin WF, Koonin EV, Allen JF, Galperin MY, Lane N, 
Archibald JM, Embley TM: Planctomycetes and eukaryotes: a case of 
analogy not homology. Bioessays 2011, 33(11):810-817. 

47. Fournier GP, Dick AA, Williams D, Gogarten JP: Evolution of the Archaea: 
emerging views on origins and phylogeny. Res Microbiol 2011, 
162(1):92-98. 

48. Liu K, Raghavan S, Nelesen S, Linder CR, Warnow T: Rapid and accurate 
large-scale coestimation of sequence alignments and phylogenetic 
trees. Science 2009, 324(5934):1561-1564. 



doi:10.1186/1745-6150-7-10 

Cite this article as: Yutin and Koonin: Archaeal origin of tubulin. Biology 
Direct 2012 7:10. 






